Boosting k-nearest neighbor classifier by means of input space projection

Authors

  • Nicolás García-Pedrajas
  • Domingo Ortiz-Boyer
Abstract

The k-nearest neighbors (k-NN) classifier is one of the most widely used classification methods, owing to several attractive features such as good generalization and easy implementation. Although simple, it is usually able to match, and even beat, more sophisticated and complex methods. However, no successful method has been reported so far for applying boosting to k-NN. As boosting has proved very effective in improving the generalization capabilities of many classification algorithms, an appropriate way of boosting k-nearest neighbors is of great interest. Ensemble methods rely on the instability of the base classifiers to improve performance; because k-NN is fairly stable with respect to resampling, such methods fail to improve the k-NN classifier. On the other hand, k-NN is very sensitive to input selection, so ensembles based on subspace methods are able to improve the performance of single k-NN classifiers. In this paper we exploit the sensitivity of k-NN to the input space to develop two methods for boosting k-NN. Both approaches modify the view of the data that each classifier receives so that accurate classification of difficult instances is favored. The two approaches are compared with the single classifier, bagging, and the random subspace method, and show a marked and significant improvement in generalization error. The comparison is performed on a large test set of 45 problems from the UCI Machine Learning Repository. A further study of noise tolerance shows that the proposed methods are less affected by class label noise than the standard methods. © 2009 Elsevier Ltd. All rights reserved.
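As an illustration of the general mechanism (not the authors' algorithm, whose details are given in the paper), the sketch below combines boosting-style instance weighting with per-member feature subspaces for k-NN. It assumes scikit-learn's KNeighborsClassifier; the function names and parameters such as subspace_fraction and n_candidates are invented here for illustration only.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def fit_projected_knn_ensemble(X, y, n_members=10, subspace_fraction=0.5,
                                   n_candidates=20, k=3, seed=0):
        # Sketch only: each round selects the feature subset (the "view" of
        # the data) that best serves the currently hard, highly weighted
        # instances, then re-weights the instances the new member gets wrong.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        sub_size = max(1, int(subspace_fraction * d))
        weights = np.full(n, 1.0 / n)           # boosting-style instance weights
        members = []                            # (feature indices, fitted k-NN) pairs
        for _ in range(n_members):
            best = None
            for _ in range(n_candidates):       # random search over feature subsets
                feats = rng.choice(d, size=sub_size, replace=False)
                clf = KNeighborsClassifier(n_neighbors=k).fit(X[:, feats], y)
                score = np.sum(weights * (clf.predict(X[:, feats]) == y))
                if best is None or score > best[0]:
                    best = (score, feats, clf)
            _, feats, clf = best
            members.append((feats, clf))
            wrong = clf.predict(X[:, feats]) != y
            weights[wrong] *= 2.0               # emphasize misclassified instances
            weights /= weights.sum()
        return members

    def predict_projected_knn_ensemble(members, X):
        # Unweighted majority vote; labels are assumed to be integers 0..C-1.
        votes = np.stack([clf.predict(X[:, feats]) for feats, clf in members])
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

This sketch only captures the key idea stated in the abstract, namely giving each k-NN member a view of the input space biased toward instances the ensemble currently finds difficult; the paper's two methods construct their projections differently.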


Similar articles

BoostML: An Adaptive Metric Learning for Nearest Neighbor Classification

The nearest neighbor classification/regression technique, despite its simplicity, is one of the most widely applied and well-studied techniques for pattern recognition in machine learning. A nearest neighbor classifier assumes class conditional probabilities to be locally smooth. This assumption is often invalid in high dimensions, and significant bias can be introduced when using the nearest ne...


FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and its application to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.


Boosting the distance estimation: Application to the K-Nearest Neighbor Classifier

In this work we introduce a new distance estimation technique based on boosting and apply it to the K-Nearest Neighbor Classifier (KNN). Instead of applying AdaBoost to a typical classification problem, we use it to learn a distance function, and the resulting distance is then used in KNN. The proposed method (Boosted Distance with Nearest Neighbor) outperforms the AdaBoost classifier when the tr...
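Whatever procedure produces the learned distance, the way it plugs into k-NN can be sketched in a few lines. The example below is an assumption-laden illustration only: a fixed feature-weighted Euclidean distance stands in for a distance learned by boosting, and it relies on scikit-learn's support for callable metrics.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def make_weighted_distance(w):
        # Weighted Euclidean distance; `w` is a placeholder for weights that a
        # distance-learning stage might produce (not the actual Boosted Distance).
        w = np.asarray(w, dtype=float)
        def dist(a, b):
            return np.sqrt(np.sum(w * (a - b) ** 2))
        return dist

    # Hypothetical weights from some distance-learning stage (four features assumed).
    feature_weights = np.array([1.0, 0.5, 2.0, 0.1])
    knn = KNeighborsClassifier(n_neighbors=5,
                               metric=make_weighted_distance(feature_weights))
    # knn.fit(X_train, y_train); knn.predict(X_test)   # training data not shown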


Performance of Classifier Architectures With The RNADS Feature Space

To evaluate the efficiency of the remote netted acoustic/seismic sensor array (RNADS) [1–6] for classification, we must investigate the performance of various classification algorithms. Currently, the U.S. Army Research Laboratory (ARL) is developing an acoustic/seismic target classifier using a backpropagation neural network (BPNN) algorithm. Various techniques for extracting features have bee...


Hilbert Space Filling Curve (HSFC) Nearest Neighbor Classifier

The Nearest Neighbor algorithm is one of the simplest and oldest classification techniques. A given collection of historic data (Training Data) of known classification is stored in memory. Then, based on the stored knowledge, the classification of an unknown instance (Test Data) is predicted by finding the classification of its nearest neighbor. For example, if an instance from the test set is presen...
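The brute-force version of the procedure described above takes only a few lines; the sketch below (plain NumPy, Euclidean distance assumed) predicts each test point's label as the label of its single nearest training point. The HSFC classifier replaces this exhaustive search with a Hilbert space-filling-curve index, which is not shown here.

    import numpy as np

    def nn_predict(X_train, y_train, X_test):
        # 1-NN by brute force: for each test point, find the closest training
        # point under Euclidean distance and return that point's label.
        preds = []
        for x in X_test:
            dists = np.linalg.norm(X_train - x, axis=1)
            preds.append(y_train[np.argmin(dists)])
        return np.array(preds)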



Journal title:
  • Expert Syst. Appl.

Volume 36, Issue

Pages -

Publication date 2009